RAW SEQUENCE LISTING DATE: 05/18/2001 

PATENT APPLICATION: US/09/830,807 TIME: 11:03:12 

Input Set : A:\gje-65.txt 

Output Set: N:\CRF3\05182001\I830807.raw 






<110> APPLICANT: Crooke, Helen R. 

Clarke, Enda E. 

Everest, Paul H. 

Dougan, Gordon 

Holden, David W. 

Shea, Jacqueline E. 

Feldman, Robert G. 
<120> TITLE OF INVENTION: VIRULENCE GENES AND PROTEINS, AND THEIR USE
<130> FILE REFERENCE: GJE-65

<140> CURRENT APPLICATION NUMBER: US/09/830,807

<141> CURRENT FILING DATE: 2001-04-30

<160> NUMBER OF SEQ ID NOS: 72

<170> SOFTWARE: PatentIn Ver. 2.1

<210> SEQ ID NO: 1

<211> LENGTH: 4333

<212> TYPE: DNA

<213> ORGANISM: Escherichia coli
<220> FEATURE:
<221> NAME/KEY: CDS
<222> LOCATION: (1017)..(2549)
<400> SEQUENCE: 1

ccattactca gaatgggcgg atacacaata aaaattgttc ttcttattac 
atgccgaggc acaaaaaaat caccgatagt tttaccatcg agaatttttt 
tcagaatttt ctaaattatt tctgatacgt ttgaatatcc agacgcacag 
accactaaca ccagtaaaaa ccacaggtgt gatattaatt cccaggccaa 
ttgtcataca atgacagtcc aggccaactt tccgctttcc ctttgacgta 
ataaattgcg gcaatgtcag tagggggatg gctgttaaca tcgggatacc 
acacgtactt tccaccattt tttcaaggga tagcgtaaaa aaagcatgta 
ccggatataa cgaaaaatac ctgcatgcgg aacgagtgga tgaagtcatt 
agccataatg acggttcggc gctattcaca tgccatgtat ggctcgaata 
atatgaaaag ggatccctaa caacatcagc caggcgcgga tggagtcgag 
cgttgcgcgg gtactgggtt catatatggt taactaatct cggatttttc 
tgtcgggtta tgcctttagg cttgttgcca tagcgacacc gacctgaccg 
aggcttcaag gtttttatgc atagcatcat cgctaccact aaccagaatg 
taagacggtt gataaataaa tttgctggca aaccctacac gaagtcgatg 
taggagaagc acggaaagtg aaaacggttg caatcaggtg cttaatccat 
gctgaacgat accgggattc tgttgtcgga atggcaggtt atccattaaa 
tcgatataag cacacaaagg gggaagtgct tactaattat gaaacataaa 



atg aaa atg cgt tgg ttg agt gct gca gta atg tta acc ctg 
Met Lys Met Arg Trp Leu Ser Ala Ala Val Met Leu Thr Leu 
5 10 15 

tct tca agc tgg gct ttc agt att gat gat gtc gca aag caa gct caa 1115 
Ser Ser Ser Trp Ala Phe Ser Ile Asp Asp Val Ala Lys Gln 
20 25 30 

tcc tta gcc ggg aaa ggc tat gag gcg ccc aaa agc aac ttg ccc tcc 1163 
290 295 300 305 

aac ccg caa ggc ttc ggt cta ttg cag cgt ggt cgt gat ttc tcc cgc 1979 
Asn Pro Gln Gly Phe Gly Leu Leu Gln Arg Gly Arg Asp Phe Ser Arg 
310 315 320 

ttt gaa gat ctc gat gat cgt tac gat ctt cgt cca agc gca tgg gtg 2027 
Phe Glu Asp Leu Asp Asp Arg Tyr Asp Leu Arg Pro Ser Ala Trp Val 
325 330 335 

act ccg aaa ggg gag tgg ggc aaa ggc agc gtt gag ctg gtg gaa att 2075 
Thr Pro Lys Gly Glu Trp Gly Lys Gly Ser Val Glu Leu Val Glu Ile 
340 345 350 

cca acc aac gat gaa acc aac gat aac atc gtc gct tac tgg acg ccg 2123 
Pro Thr Asn Asp Glu Thr Asn Asp Asn Ile Val Ala Tyr Trp Thr Pro 
355 360 365 

gat cag ctg ccg gag ccg ggt aaa gag atg aac ttt aaa tac acc atc 2171 
Asp Gln Leu Pro Glu Pro Gly Lys Glu Met Asn Phe Lys Tyr Thr Ile 
370 375 380 385 

acc ttc agc cgt gat gaa gac aaa ctg cat gcg cca gat aac gca tgg 2219 
Thr Phe Ser Arg Asp Glu Asp Lys Leu His Ala Pro Asp Asn Ala Trp 
390 395 400 

gtg caa caa acg cgt cgt tca acg ggg gat gtg aag cag tcg aac ctg 2267 
Val Gln Gln Thr Arg Arg Ser Thr Gly Asp Val Lys Gln Ser Asn Leu 
405 410 415 

att cgc cag cct gac ggt act atc gcc ttt gtg gtc gat ttt acc ggc 2315 
Ile Arg Gln Pro Asp Gly Thr Ile Ala Phe Val Val Asp Phe Thr Gly 
420 425 430 

gct gag atg aaa aaa ctg cca gag gat acc ccg gtc aca gcg caa acc 2363 
Ala Glu Met Lys Lys Leu Pro Glu Asp Thr Pro Val Thr Ala Gln Thr 
435 440 445 

agc att ggt gat aat ggt gag ata gtt gaa agc acg gtg cgt tat aac 2411 
Ser Ile Gly Asp Asn Gly Glu Ile Val Glu Ser Thr Val Arg Tyr Asn 
450 455 460 465 

ccg gtt acc aaa ggc tgg cgt ctg gtg atg cgt gtg aaa gtg aaa gat 2459 
Pro Val Thr Lys Gly Trp Arg Leu Val Met Arg Val Lys Val Lys Asp 
470 475 480 

gcc aag aaa acc act gaa atg cgt gct gcg ctg gtg aat gcc gat cag 2507 
Ala Lys Lys Thr Thr Glu Met Arg Ala Ala Leu Val Asn Ala Asp Gln 
485 490 495 

acg ttg agt gaa acc tgg agc tac cag tta cct gcc aat gaa 2549 
Thr Leu Ser Glu Thr Trp Ser Tyr Gln Leu Pro Ala Asn Glu 
500 505 510 

taagacaact gagtacattg acgcaatgcc catcgccgca agcgagaaag cggcattgcc 2609 
gaagactgat atccgcgccg ttcatcaggc gctggatgcc gaacaccgca cctgggcgcg 2669 
ggaggatgac tccccgcaag gctcggtaaa ggcgcgtctg gaacaagcct ggccagattc 2729 
acttgctgat ggacagttaa ttaaagacga cgaagggcgc gatcagctaa aggcgatgcc 2789 
agaagtaaaa cgctcctcga tgtttcccga cccgtggcgt accaacccgg taggccgttt 2849 
ctgggatcgc ctgcgtggac gcgatgtgac gccgcgctat ctggctcgtt tgaccaaaga 2909 
agagcaggag agtgagcaaa agtggcgtac cgtcggtacc atccgccgtt acattctgtt 2969 
gatcctgacg ctcgcgcaaa ctgttgtcgc gacctggtat atgaagacca ttcttcctta 3029 
tcaggggtgg gcgctgatta atcctatgga tatggttggt caggatgtgt gggtttcctt 3089 
tatgcagctt ctgccttata tgctgcaaac cggtatcctg atcctctttg cggtactgtt 3149 
ctgttgggtg tccgccggat tctggaccgg cgttgatggg cttcctgcaa ctgcttattg 3209 
gtcgcgataa atacagtata tctgcgtcaa cagttggcga tgaaccatta aacccggagc 3269 
atcgcacggc gttgatcatg cctatctgta acgaagacgt gaaccgtgtt tttgctggct 3329 
tgcgtgcaac gtgggaatca gtaaaagcca ccgggaatgc caaacatttt gatgtctaca 3389 
ttcttagtga cagttataac ccggatatct gcgtcgcaga gcaaaaagcc tggatggagc 3449 
ttatcgctga agtcggtggc gaaggtcaga ttttctatcg ccgccgccgc cgtcgcgtga 3509 
agcgtaaaag cggtaatatc gatgacttct gccgtcgctg gggcagccag tacagctaca 3569 
tggtggtgct ggatgctgac tcggtaatga ccggtgattg tttgtgcggc ctggtgcgcc 3629 
tgatggaagc caacccgaac gccgggatca ttcagtcgtc gccgaaagcg tccggcatgg 3689 
atacgctgta tgcgcgctgt cagcagttcg cgacccgcgt gtatgggcca ctgtttacag 3749 
ccggtttgca cttctggcaa cttggcgagt cgcactactg ggggcataac gcgattatcc 3809 
gcgtgaaacc gtttatcgag cactgtgcac tggctccgct gccgggcgaa ggttcttttg 3869 
ccggttcaat cctytcacat gacttcgtgg aagcggcgtt gatgcgccgt gcaggttggg 3929 
gggtctggat tgcttacgat ctcccgggtt cttatgaaga attaccgcct aacttgcttg 3989 
atgagctaaa acgtgaccgc cgctggtgcc acggtaacct gatgaacttc cgtctgttcc 4049 
tggtgaaggg tatgcacccg gttcaccgtg cggtgttcct gacgggcgtg atgtcttatc 4109 
tctccgctcc gctgtggttt atgttcctcg cgctctctac tgcattgcag gtagtacatg 4169 
cgttgaccga accgcaatac ttcctgcaac cacggcagtt gttcccggta tggccgcagt 4229 
ggcgtcctga gctggcgatt gcactttttg cttcgaccat ggtgctgttg ttcctgccga 4289 
agctattgag cattttgctt atctggtgca aaggaacgaa agaa 4333 

<210> SEQ ID NO: 2 

<211> LENGTH: 511 

<212> TYPE: PRT 

<213> ORGANISM: Escherichia coli 
<400> SEQUENCE: 2 

<210> SEQ ID NO: 3 

<211> LENGTH: 574 

<212> TYPE: DNA 

<213> ORGANISM: Escherichia coli 

<400> SEQUENCE: 3 

ttcgttgatc ctgtcaccgt ttgttcggtt atttccagcc gtgccaccgt tggtctgcga 60 

accaaacgct ggaaactgtt ccctgatccc ggaagagtat tcaccgccgc aggtgctggt 120 

tgataccgat cggttccttg agatgaatcg tcaatgctcc cttgatgatg gttttatgca 180 